Semi-Supervised Learning For Sentiment Analysis

Authors

  • John Miller
  • Aran Nayebi
  • Amr Mohamed
Abstract

We use vector space embeddings of sentences and nearest-neighbor methods to transform a small amount of labelled training data into a significantly larger training set drawn from an unlabelled corpus. The quality of the larger training set is measured by prediction accuracy on a benchmark sentiment analysis task. Our results indicate that it is possible to achieve accuracy within 3-5% of the baseline using only 5-8% of the amount of labelled data.
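The abstract does not spell out the expansion procedure, so the following is only a minimal sketch of one way nearest-neighbor label expansion could work, not the authors' implementation. It assumes sentence embeddings are already available as NumPy arrays; the function name `expand_labels`, the cosine-similarity metric, the parameter `k`, and the `min_agreement` threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def expand_labels(labelled_vecs, labels, unlabelled_vecs, k=3, min_agreement=1.0):
    """Pseudo-label unlabelled vectors by a k-nearest-neighbour vote.

    Hypothetical helper: returns the indices of the unlabelled rows that
    received a pseudo-label, together with those pseudo-labels.
    """
    labelled_vecs = np.asarray(labelled_vecs, dtype=float)
    unlabelled_vecs = np.asarray(unlabelled_vecs, dtype=float)
    labels = np.asarray(labels)

    def normalise(m):
        # Scale rows to unit length so a dot product is a cosine similarity.
        norms = np.linalg.norm(m, axis=1, keepdims=True)
        return m / np.clip(norms, 1e-12, None)

    # sims[i, j] = cosine similarity of unlabelled row i and labelled row j.
    sims = normalise(unlabelled_vecs) @ normalise(labelled_vecs).T

    # Indices of the k most similar labelled sentences for each unlabelled one.
    nearest = np.argsort(-sims, axis=1)[:, :k]
    neighbour_labels = labels[nearest]

    kept_idx, pseudo = [], []
    for i, row in enumerate(neighbour_labels):
        values, counts = np.unique(row, return_counts=True)
        best = counts.argmax()
        # Keep the sentence only if enough of its neighbours agree (assumed rule).
        if counts[best] / row.size >= min_agreement:
            kept_idx.append(i)
            pseudo.append(values[best])
    return np.array(kept_idx, dtype=int), np.array(pseudo)


# Toy 2-D "embeddings": two positive (1) and two negative (0) labelled sentences.
labelled = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
y = [1, 1, 0, 0]
unlabelled = [[0.95, 0.05], [0.05, 0.95], [0.7, 0.7]]

idx, pseudo = expand_labels(labelled, y, unlabelled, k=2)
```

In this toy run the third unlabelled vector lies between the two classes, so its neighbours disagree and it is left out of the expanded set. Requiring full agreement trades coverage for label quality; a lower `min_agreement` would enlarge the training set at the cost of noisier pseudo-labels.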

Similar resources

A Semi-Supervised Framework Based on a Self-Constructed Adaptive Lexicon for Persian Opinion Analysis

With the appearance of Web 2.0 and 3.0, users' contributions to the WWW have created a huge amount of valuable expressed opinions. Because manually analyzing such big data is difficult or impossible, sentiment analysis, a branch of natural language processing, has received considerable attention. Unlike for other (popular) languages, a limited number of research studies have been conducted in ...

Semi-supervised Probabilistic Sentiment Analysis: Merging Labeled Sentences with Unlabeled Reviews to Identify Sentiment

Document-level sentiment analysis, the task of determining whether the sentiment expressed in a document is positive or negative, is commonly performed by supervised methods. As with all supervised tasks, obtaining training data for these methods can be expensive and time-consuming. Some semi-supervised approaches have been proposed that rely on sentiment lexicons. We propose a novel supervised ...

Sentiment Analysis by Augmenting Expectation Maximisation with Lexical Knowledge

Sentiment analysis of documents aims to characterise the positive or negative sentiment expressed in documents. It has been formulated as a supervised classification problem, which requires large numbers of labelled documents. Semi-supervised sentiment classification, using a limited number of documents or words labelled with sentiment polarities, is an approach to reducing labelling cost for effective learn...

More Is Better: Large Scale Partially-supervised Sentiment Classification

We describe a bootstrapping algorithm to learn from partially labeled data, and the results of an empirical study for using it to improve performance of sentiment classification using up to 15 million unlabeled Amazon product reviews. Our experiments cover semi-supervised learning, domain adaptation and weakly supervised learning. In some cases our methods were able to reduce test error by more...

Incremental Learning on Sentiment Analysis Using Weakly Supervised Learning Techniques

Owing to the advanced technologies of Web 2.0, people participate in and exchange opinions through social media sites such as Web forums and weblogs. Classifying and analysing such opinions and sentiment information is potentially important for service and product providers as well as users, because this analysis is used for making valuable decisions. Sentiment is expressed differentl...

Publication date: 2014